We study theoretical properties of regularized robust M-estimators, applicable when data are drawn from a sparse high-dimensional linear model and contaminated by heavy-tailed distributions and/or outliers in the additive errors and covariates. We first establish a form of local statistical consistency for the penalized regression estimators under fairly mild conditions on the error distribution: When the derivative of the loss function is bounded and satisfies a local restricted curvature condition, all stationary points within a constant radius of the true regression vector converge at the minimax rate enjoyed by the Lasso with sub-Gaussian errors. When an appropriate nonconvex regularizer is used in place of an l_1-penalty, we show that such stationary points are in fact unique and equal to the local oracle solution with the correct support; hence, results on asymptotic normality in the low-dimensional case carry over immediately to the high-dimensional setting. This has important implications for the efficiency of regularized nonconvex M-estimators when the errors are heavy-tailed. Our analysis of the local curvature of the loss function also has useful consequences for optimization when the robust regression function and/or regularizer is nonconvex and the objective function possesses stationary points outside the local region. We show that as long as a composite gradient descent algorithm is initialized within a constant radius of the true regression vector, successive iterates will converge at a linear rate to a stationary point within the local region. Furthermore, the global optimum of a convex regularized robust regression function may be used to obtain a suitable initialization. The result is a novel two-step procedure that uses a convex M-estimator to achieve consistency and a nonconvex M-estimator to increase efficiency.
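To make the two-step procedure concrete, here is a minimal sketch (not the paper's exact algorithm) of composite gradient descent: each iterate takes a gradient step on a robust loss followed by a proximal step on the penalty. The convex stage pairs a Huber loss with an l_1 penalty and starts from zero; the nonconvex stage pairs the same loss with an MCP penalty and starts from the convex solution. The particular loss, penalty, and all tuning constants (delta, lam, gamma, the step size) are illustrative assumptions, not prescriptions from the abstract.

```python
import numpy as np

def huber_grad(X, y, beta, delta=1.345):
    """Gradient of the Huber loss (1/n) * sum_i huber(y_i - x_i' beta)."""
    r = y - X @ beta
    psi = np.clip(r, -delta, delta)  # bounded score, as the theory requires
    return -X.T @ psi / len(y)

def soft_threshold(z, t):
    """Prox operator of t * ||.||_1 (the l_1 penalty)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def mcp_prox(z, lam, gamma, eta):
    """Closed-form prox of eta * MCP penalty; requires gamma > eta."""
    shrunk = soft_threshold(z, eta * lam) / (1.0 - eta / gamma)
    return np.where(np.abs(z) <= gamma * lam, shrunk, z)

def composite_gd(X, y, beta0, prox, lam, eta, n_iter=500):
    """Composite gradient descent: loss gradient step, then penalty prox step."""
    beta = beta0.copy()
    for _ in range(n_iter):
        beta = prox(beta - eta * huber_grad(X, y, beta), lam)
    return beta

# Simulated sparse model with heavy-tailed (t_2) additive errors.
rng = np.random.default_rng(0)
n, p, s = 200, 500, 5
X = rng.standard_normal((n, p))
beta_star = np.zeros(p); beta_star[:s] = 1.0
y = X @ beta_star + rng.standard_t(df=2, size=n)

eta = n / np.linalg.norm(X, 2) ** 2   # inverse Lipschitz constant of the Huber gradient
lam = 2 * np.sqrt(np.log(p) / n)      # rate-level choice; the constant is illustrative

# Step 1: convex stage (Huber + l_1) yields a consistent initialization.
beta_init = composite_gd(X, y, np.zeros(p), soft_threshold, lam, eta)

# Step 2: nonconvex stage (Huber + MCP) started inside the local region.
gamma = 3.0
beta_hat = composite_gd(X, y, beta_init,
                        lambda z, l: mcp_prox(z, l, gamma, eta), lam, eta)
```

Because the nonconvex stage is started at the convex solution, the iterates stay within the local region where, per the results above, the stationary point is unique and oracle-like.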